Corpus: ces_news_2022_300K


5.1.18 Words nearly always as next neighbors

Strong NN co-occurrences with a low probability of being separated

The quotient below is calculated as freq(word1)*freq(word2)/NN_freq^2. Values close to 1 mean that the two words almost never occur apart.

Word 1 Word 2 Frequency of word 1 Frequency of word 2 Frequency as NN Quotient
TOP 09 415 369 366 1.14
play off 349 312 306 1.16
pohonných hmot 194 191 188 1.05
Mladé Boleslavi 163 172 139 1.45
Nord Stream 175 149 147 1.21
Premier League 125 138 115 1.30
Čapí hnízdo 120 130 117 1.14
Los Angeles 114 100 100 1.14
MF DNES 122 99 90 1.49
Roland Garros 91 86 86 1.06
Pekarová Adamová 87 80 72 1.34
Elona Muska 45 61 45 1.36
RIA Novosti 56 60 53 1.20
Wall Street Journal 43 45 37 1.41
širým nebem 30 41 30 1.37
Recep Tayyip 39 39 39 1.00
Uherského Hradiště 33 38 29 1.49
oxidu uhličitého 48 38 38 1.26
opičích neštovic 30 37 30 1.23
Maple Leafs 30 32 30 1.07
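
The quotient column can be recomputed from the three frequency columns. The following is a minimal sketch, assuming the formula freq(word1)*freq(word2)/NN_freq^2 and rounding to two decimals; the function name `nn_quotient` and the `rows` list are illustrative and not part of the corpus tooling.

```python
def nn_quotient(freq_word1: int, freq_word2: int, nn_freq: int) -> float:
    """Return freq(word1) * freq(word2) / NN_freq^2.

    nn_freq is the number of times word1 is immediately followed by word2.
    (Hypothetical helper, written only to illustrate the formula.)
    """
    if nn_freq <= 0:
        raise ValueError("nn_freq must be positive")
    return freq_word1 * freq_word2 / nn_freq ** 2


# A few rows copied from the table: (word 1, word 2, freq 1, freq 2, freq as NN)
rows = [
    ("TOP", "09", 415, 369, 366),
    ("Los", "Angeles", 114, 100, 100),
    ("Recep", "Tayyip", 39, 39, 39),
]

for word1, word2, f1, f2, nn in rows:
    print(f"{word1} {word2}: {nn_quotient(f1, f2, nn):.2f}")
```

Since a pair cannot occur as next neighbors more often than either word occurs at all, the quotient is at least 1, and it equals exactly 1 only when both words occur solely in this pair, as with "Recep Tayyip".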
918 msec needed at 2023-02-15 22:09